Construction of DNA sequences and their use for microbial production of proteins, in particular human serum albumin.

By means of reverse transcription of mRNA coding
for a desired polypeptide, there is obtained a set of
overlapping fragments of duplex cDNA, which together
correspond to the whole mRNA molecule. The fragments
have overlapping regions bearing sites for restriction
enzymes, such that cutting and ligation gives DNA
corresponding to the polypeptide. This is introduced into
a vector in reading frame with a promoter. Transformation
of a microorganism enables expression of the polypeptide.

The construction via fragments enables large molecules
to be made. Thus, human serum albumin (HSA)
is produced by E. coli transformed with plasmid pHSA1.
This includes DNA made from fragments derived from
reverse transcription of mRNA from human liver.

CONSTRUCTION OF DNA SEQUENCES AND THEIR USE
FOR MICROBIAL PRODUCTION OF PROTEINS, IN
PARTICULAR HUMAN SERUM ALBUMIN

This invention relates to recombinant DNA technology. 
It particularly relates to the application of the technology
to the production of human serum albumin (HSA) in micro- 
organisms for use in the therapeutic treatment of humans. 
In one aspect the invention relates to a technique for 
producing DNA sequences encoding desired polypeptides. In
another aspect it relates to the construction of microbial 
expression vehicles containing DNA sequences encoding a 
protein, e.g. human serum albumin or the biologically active 
component thereof operably linked to expression effecting 
promoter systems and to the expression vehicles so 
constructed. In another aspect, the present invention 
relates to microorganisms transformed with such expression
vehicles, thus directed in the expression of the DNA sequences
referred to above. In yet other aspects, this invention relates to
the means and methods of converting the end product of such
expression to entities, such as pharmaceutical compositions, useful
for the therapeutic treatment of humans. In preferred embodiments,
this invention provides for particular expression vectors that are
sequenced properly such that mature human serum albumin is produced
directly.

In one aspect, the present invention is particularly directed to a method of
preparing cDNA encoding polypeptides or biologically active portions
thereof. This aspect provides the means and methods of utilizing
synthetic primer DNA corresponding to a portion of the mRNA of the
intended polypeptide, adjacent to a known endonuclease restriction site,
in order to obtain by reverse transcription a series of DNA fragments
encoding sequences of the polypeptide. These fragments are prepared such
that the entire desired protein coding sequence is represented, the
individual fragments containing overlapping DNA sequences harboring
common endonuclease restriction sites within the corresponding
overlapping sequence. This aspect facilitates the selective cleavage and
ligation of the respective fragments so as to assemble the entire cDNA
sequence encoding the polypeptide in proper reading frame. This
discovery permits the obtention of cDNA of high molecular weight proteins
which otherwise may not be available through use of usual reverse
transcriptase methods and/or chemical synthesis.
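The assembly principle described above can be pictured with a minimal sketch. It assumes invented toy sequences (not HSA cDNA) and a simplified joining rule; only the PstI and BglII recognition sequences are real. The function name join_at_site and the fragment names are hypothetical.

```python
# Toy illustration only: the sequences below are invented, not HSA cDNA.
PSTI = "CTGCAG"   # PstI recognition sequence
BGLII = "AGATCT"  # BglII recognition sequence


def join_at_site(left: str, right: str, site: str) -> str:
    """Join two overlapping fragments at a restriction site both contain.

    Keeps `left` through its last copy of the site and appends what follows
    the first copy of the site in `right`. This is a simplification: real
    enzymes cut inside the site, but the ligated product reads the same.
    """
    i = left.rfind(site)
    j = right.find(site)
    if i < 0 or j < 0:
        raise ValueError(f"site {site} missing from one fragment")
    return left[: i + len(site)] + right[j + len(site):]


# Hypothetical full-length message and three overlapping partial clones,
# each pair of neighbours sharing one restriction site in its overlap.
FULL = "ATGAAACCCGGGTTT" + PSTI + "TTTGAAGCC" + BGLII + "GGCTTATAA"
five_prime, middle, three_prime = FULL[:27], FULL[10:40], FULL[25:]

assembled = join_at_site(
    join_at_site(five_prime, middle, PSTI), three_prime, BGLII
)
print(assembled == FULL)
```

Because each join happens at a site inside the overlap, the reading frame of the reassembled sequence is the same as that of the original message.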

The publications and other materials hereof used to illuminate the
background of the invention, and in particular cases, to provide
additional details respecting its practice are incorporated herein by
reference, and for convenience, are numerically referenced in the
following text and grouped in the appended bibliography.

(A) Human Serum Albumin

Human serum albumin (HSA) is the major protein species in adult
serum. It is produced in the liver and is largely responsible
for maintaining normal osmolarity in the bloodstream and
functions as a carrier for numerous serum molecules (1, 2).
The apparent fetal counterpart of HSA is α-fetoprotein and
studies have been undertaken to compare the two as well as rat
serum albumin and α-fetoprotein (3-8). The complete protein
sequence of HSA has been published (9-12). The published
protein sequences of HSA disagree in about 20 residues as well
as in the total number of amino acids in the mature protein
[584 amino acids (9); 585 (12)]. Some evidence suggests that
HSA is initially synthesized as a precursor molecule (13, 14)
containing a "prepro" sequence. The precursor forms of bovine
(15) and rat (16) serum albumin have also been sequenced.

The rationale for the therapeutic use of albumin is the
treatment of hypovolemia, hypoproteinemia and shock. Albumin
currently is used to improve the plasma oncotic (colloid
osmotic) pressure, caused by solutes (colloids) which are not
able to pass through capillary pores. Inasmuch as albumin has a
low permeability constant, it essentially confines itself to the
intravascular compartment. When different concentrations of
nondiffusible particles exist on opposite sides of the cell
membrane, water crosses the partition until the concentrations
of particles are equal on both sides. In this process of
osmosis, albumin plays a vital role in maintaining the liquid
content in blood.



Thus, the therapeutic benefits of albumin administration reside
primarily in the treatment of conditions where there is a loss
of liquid from the intravascular compartment, such as in
surgical operations, shock, burns, and hypoproteinemia
resulting in edema. Albumin is also used for diagnostic
applications in which its nonspecific ability to bind to other
proteins makes it useful in various diagnostic solutions.

Presently, human serum albumin is produced from whole blood
fractionation techniques, and thus is not available in large
amounts at competitive costs. The application of recombinant
DNA technology makes possible the production of copious amounts
of human serum albumin by genetically directing
microorganisms to produce it efficiently. The present
invention may enable the availability of purified HSA
produced through recombinant DNA technology more abundantly and
at lower cost than is presently possible. The present
invention also provides knowledge of the DNA sequence
organization of human serum albumin and its deduced amino acid
sequence, helping to elucidate the evolutionary, regulatory,
and functional properties of human serum albumin as well as its
related proteins such as alpha-fetoprotein.

More particularly, the present invention provides for the isolation
of cDNA clones spanning the entire sequence of the protein
coding and 3' untranslated portions of HSA mRNA. These cDNA
clones were used to construct a recombinant expression vehicle
which directed the expression in a microorganism strain of the
mature HSA protein under control of the trp promoter. The
present invention also provides the complete nucleotide and
deduced amino acid sequence of HSA.



Reference herein to the expression of "mature human serum
albumin" connotes the microbial production of human serum
albumin unaccompanied by the presequence ("prepro") that
immediately attends translation of the human serum albumin
mRNA. Mature human serum albumin, according to the present
invention, is immediately expressed from a translation start
signal (ATG), which also encodes the amino acid methionine
linked to the first amino acid of albumin. This methionine
amino acid can be naturally cleaved by the microorganism so as
to prepare the human serum albumin directly. Mature human
serum albumin can be expressed together with a conjugated
protein other than the conventional leader, the conjugate being
specifically cleavable in an intra- or extracellular
environment. See British patent publication number 2007676A.
Finally, the mature human serum albumin can be produced in
conjunction with a microbial signal polypeptide which
transports the conjugate to the cell wall, where the signal is
processed away and the mature human serum albumin secreted.

(B) Recombinant DNA Technology

With the advent of recombinant DNA technology, the controlled
microbial production of an enormous variety of useful
polypeptides has become possible. Many mammalian polypeptides,
such as human growth hormone and human and hybrid leukocyte
interferons, have already been produced by various
microorganisms. The power of the technology admits the
microbial production of an enormous variety of useful
polypeptides, putting within reach the microbially directed
manufacture of hormones, enzymes, antibodies, and vaccines
useful for a variety of drug-targeting applications.

A basic element of recombinant DNA technology is the plasmid,
an extrachromosomal loop of double-stranded DNA found in
bacteria oftentimes in multiple copies per cell. Included in
the information encoded in the plasmid DNA is that required to
reproduce the plasmid in daughter cells (i.e., a "replicon" or
origin of replication) and ordinarily, one or more phenotypic
selection characteristics, such as resistance to antibiotics,
which permit clones of the host cell containing the plasmid of
interest to be recognized and preferentially grown in selective
media. The utility of bacterial plasmids lies in the fact that
they can be specifically cleaved by one or another restriction
endonuclease or "restriction enzyme", each of which recognizes
a different site on the plasmid DNA. Thereafter heterologous
genes or gene fragments may be inserted into the plasmid by
endwise joining at the cleavage site or at reconstructed ends
adjacent to the cleavage site. (As used herein, the term
"heterologous" refers to a gene not ordinarily found in, or a
polypeptide sequence ordinarily not produced by, a given
microorganism, whereas the term "homologous" refers to a gene
or polypeptide which is found in, or produced by the
corresponding wild-type microorganism.) Thus formed are
so-called replicable expression vehicles.






DNA recombination is performed outside the microorganism, and
the resulting "recombinant" replicable expression vehicle, or
plasmid, can be introduced into microorganisms by a process
known as transformation and large quantities of the
heterologous gene-containing recombinant vehicle obtained by
growing the transformant. Moreover, where the gene is properly
inserted with reference to portions of the plasmid which govern
the transcription and translation of the encoded DNA message,
the resulting expression vehicle can be used to actually
produce the polypeptide sequence for which the inserted gene
codes, a process referred to as expression.

Expression is initiated in a DNA region known as the promoter.
In the transcription phase of expression, the DNA unwinds,
exposing the sense coding strand of the DNA as a template for
initiated synthesis of messenger RNA from the 5' to 3' end of
the entire DNA sequence. The messenger RNA is, in turn, bound
by ribosomes, where the messenger RNA is translated into a
polypeptide chain having the amino acid sequence for which the
DNA codes. Each amino acid is encoded by a nucleotide triplet
or "codon" which collectively make up the "structural gene",
i.e., that part of the DNA sequence which encodes the amino
acid sequence of the expressed polypeptide product.

Translation is initiated at a "start" signal (ordinarily ATG,
which in the resulting messenger RNA becomes AUG). So-called
stop codons, transcribed at the end of the structural gene,
signal the end of translation and, hence, of the production of
further amino acid units. The resulting product may be
obtained by lysing the host cell and recovering the product by
appropriate purification from other proteins.
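The flow from structural gene to polypeptide described above can be sketched in a few lines. This is an illustrative model only: it assumes the standard genetic code, treats transcription as a simple T-to-U substitution on the coding strand, and uses an invented toy gene; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def transcribe(coding_dna: str) -> str:
    """Model transcription: the mRNA reads like the coding strand with U for T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon (one-letter amino acids)."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# Invented toy gene: flanking bases, ATG start, three codons, TAA stop.
toy_gene = "CC" + "ATGAAATTTGGG" + "TAA" + "CC"
print(translate(transcribe(toy_gene)))
```

The start codon contributes a leading methionine, which is why direct expression from an added ATG yields a product beginning with Met.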

In practice, the use of recombinant DNA technology can express
entirely heterologous polypeptides (so-called direct
expression) or alternatively may express a heterologous
polypeptide, fused to a portion of the amino acid sequence of a
homologous polypeptide. In the latter cases, the intended
bioactive product is rendered bioinactive within the fused,
homologous/heterologous polypeptide until it is cleaved in an
extracellular environment. See Wetzel, American Scientist 68,
664 (1980).

If recombinant DNA technology is to fully sustain its promise,
systems must be devised which optimize expression of gene
inserts, so that the intended polypeptide products can be made
available in controlled environments and in high yields.

(C) State of the Art

Sargent et al., in Proc. Natl. Acad. Sci. (USA) 78, 243 (1981),
describe the cloning of rat serum albumin messenger RNA in a
series of recombinant DNA plasmids. This was done to determine
the nucleotide sequences of the clones in order to study the
evolutionary hypothesis of the protein product. Thus, these
workers made no attempt to assemble the cDNA fragments they
prepared.

In Journal of Supramolecular Structure and Cellular
Biochemistry, Supplement 5, 1981, Alan R. Liss, Inc., NY,
Dugaiczyk et al. report, in abstract form, their studies of the
human gene for human serum albumin. They obtained cDNA
fragments but there is no evidence that these workers cloned or
produced the fragments for any purpose other than for studying
the basic molecular biology of the α-fetoprotein and serum
albumin genes.



The present invention is based upon the discovery that recombinant DNA
technology can be used to successfully and efficiently produce human
serum albumin in direct form. The product is suitable for use in
therapeutic treatment of human beings in need of supplementation of
albumin. The product is produced by genetically directed microorganisms
and thus the potential exists to prepare and isolate HSA in a more
efficient manner than is presently possible by blood fractionation
techniques. It is noteworthy that we have
succeeded in genetically directing a microorganism to produce a
protein of enormous length (584 amino acids), corresponding to an mRNA
transcript upwards of about 2,000 bases.

The present invention comprises the human serum albumin thus produced and
the means and methods of its production. The present invention is
further directed to replicable DNA expression vehicles harboring gene
sequences encoding HSA in directly expressible form. Further, the
present invention is directed to microorganism strains transformed with
the expression vehicles described above and to microbial cultures of such
transformed strains, capable of producing HSA. In still further aspects,
the present invention is directed to various processes useful for
preparing said HSA gene sequences, DNA expression vehicles, microorganism
strains and cultures and to specific embodiments thereof. Still further,
this invention is directed to the preparation of cDNA sequences encoding
polypeptides which are heterologous to the microorganism host, such as
human serum albumin, utilizing synthetic DNA primer sequences
corresponding in sequence to regions adjacent to known restriction
endonuclease sites, such that individual fragments of cDNA can be
prepared which overlap in the regions encoding the common restriction
endonuclease sites. This embodiment enables the precise cleavage and
ligation of the fragments so as to prepare the properly encoded DNA
sequence for the intended polypeptide.



The work described herein involved the expression of human serum albumin
(HSA) as a representative polypeptide which is heterologous to the
microorganism employed as host. Likewise the work described involved use
of the microorganism E. coli K-12 strain 294 (end A, thi-, hsr-,
khsm+), as described in British Patent Publication No. 2055382 A.
This strain has been deposited with the American Type Culture Collection,
ATCC Accession No. 31446.

The invention, in its most preferred embodiments, is described with
reference to E. coli, including not only E. coli K-12 strain 294,
defined above, but also other known E. coli strains such as E. coli B,
E. coli χ1776 and E. coli W 3110, or other microbial strains many of
which are deposited and (potentially) available from recognized
microorganism depository institutions, such as the American Type Culture
Collection (ATCC); cf. the ATCC catalogue listing. See also German
Offenlegungsschrift 2644432. These other microorganisms include, for
example, Bacilli such as Bacillus subtilis and other enterobacteriaceae
among which can be mentioned as examples Salmonella typhimurium and
Serratia marcescens, utilizing plasmids that can replicate and express
heterologous gene sequences therein. Yeast, such as Saccharomyces
cerevisiae, may also be employed to advantage as host organism in the
preparation of the interferon proteins hereof by expression of genes
coding therefor under the control of a yeast promoter. (See the
copending U.S. patent application of Hitzeman et al., filed February 25,
1981 (Attorney Docket No. 100/43), assignee Genentech, Inc. et al., or the
corresponding European Application 82300949.3, which are
incorporated herein by reference.)

Preferred embodiments of the invention will now be described 
with reference to the accompanying drawings in which: 

Figs. 1A and B are diagrams for use in explaining the
construction of plasmid pHSA1;

Fig. 2 shows the immunoprecipitation of bacterially
synthesised HSA; and

Fig. 3 shows the amino acid sequence of HSA and the
corresponding DNA sequence.

In Fig. 1A, the top line represents the mRNA coding for the human serum
albumin protein and below it the regions contained in the cDNA
clones F-47, F-61 and B-44 described further herein. The
initial and final amino acid codons of the mature HSA mRNA are
indicated by circled 1 and 585 respectively. Restriction
endonuclease sites involved in the construction of pHSA1 are
shown by vertical lines. An approximate size scale in
nucleotides is included.

The completed plasmid pHSA1 is shown in Fig. 1B, with HSA coding regions
derived from cDNA clones shaded as in A). Selected restriction
sites and terminal codons number 1 and 585 are indicated as
above. The E. coli trp promoter-operator region is shown with
an arrow representing the direction of transcription. G:C
denotes an oligo dG:dC tail. The leftmost XbaI site and the
initiation codon ATG were added synthetically. The
tetracycline (Tc) and ampicillin (Ap) resistance genes in the
pBR322 portion of pHSA1 are indicated by a heavy line.

Figure 2 depicts the immunoprecipitation of bacterially synthesized HSA.

E. coli cells transformed with albumin expression plasmid pHSA1
(lanes 4 and 5) or control plasmid pLeIF A25 (containing an
interferon α gene in the identical expression vehicle; lanes 2, 3
and 7) were grown in 35S-methionine-supplemented media. Samples
in lanes 2, 4 and 7 were induced for expression from the trp
promoter in M9 media lacking tryptophan; samples in lanes 3 and 5
were grown in tryptophan-containing LB broth to repress the trp
promoter. Each sample lane of the autoradiograph of the
SDS-polyacrylamide gel presented here contains labeled protein
immunoprecipitated from 0.75 ml of cells at a density of A550 = 1.
Lanes 1 and 6 contain radioactive protein standards (BRL) whose
molecular weight in kilodaltons is indicated at the left.
Bacterially synthesized HSA is seen in lane 4 comigrating with the
68,000 d 14C-labeled bovine serum albumin standards. Increased
production of serum albumin in the induced versus repressed culture
of pHSA1 represents higher levels of synthesis of plasmid encoded
protein rather than a difference in 35S-methionine pool specific
activities for minimal versus rich media (data not shown). The
sharp band at 60,000 d is an apparent artifact; this band is seen in
both induced and repressed pHSA1 and control transformants, and
binds to preimmune (lane 7) as well as anti-HSA IgGs (lanes 2-5).
The minor 47,000 d band in lane 4 is apparently plasmid encoded and
may represent a prematurely terminated form of bacterially
synthesized HSA.

Figure 3 depicts the nucleotide and amino acid sequence of human serum 
albumin. 

The DNA sequence of the mature protein coding and 3' untranslated
regions of HSA mRNA were determined from the recombinant plasmid
pHSA1. The DNA sequence of the prepro peptide coding and 5'
untranslated regions were determined from the plasmid P-14 (see
text). Predicted amino acids are included above the DNA sequence
and are numbered from the first residue of the mature protein. The
preceding 24 amino acids comprise the prepro peptide. The five
amino acid residues which disagree with the protein sequence of HSA
reported by both Dayhoff (9) and Meloun et al. (12) are underlined.
The above nucleotide sequence probably does not extend to the true
5' terminus of HSA mRNA. In the albumin direct expression plasmid
pHSA1, the mature protein coding region is immediately preceded by
the E. coli trp promoter-operator-leader peptide ribosome binding
site (36, 37), an artificial XbaI site, and an artificial initiation
codon ATG; the prepro region has been excised. The nucleotides
preceding HSA codon no. 1 in pHSA1 read 5'-TCACGTAAAAAGGGTATCTAGATG.
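The junction just quoted can be checked with a short script. It is a sketch that uses only the 24 nucleotides given above and the XbaI recognition sequence TCTAGA; the variable names are illustrative.

```python
XBAI = "TCTAGA"                        # XbaI recognition sequence
UPSTREAM = "TCACGTAAAAAGGGTATCTAGATG"  # nucleotides quoted above, ending in ATG

site_start = UPSTREAM.find(XBAI)
atg_start = UPSTREAM.find("ATG")

# The last base of the XbaI site should be the A of the initiation codon.
shares_adenosine = atg_start == site_start + len(XBAI) - 1

# A filled-in XbaI end reads ...TCTAG; joining it to an insert that begins
# with ATG restores the complete TCTAGA site.
filled_vector_end = UPSTREAM[:atg_start]
rejoined = filled_vector_end + "ATG"

print(site_start, atg_start, shares_adenosine, XBAI in rejoined)
```

In this sequence the XbaI site and the ATG overlap by one base, so the artificial site and the initiation codon sit directly against codon no. 1.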

Detailed Description 

(A) Synthesis and Cloning of cDNA. Poly(A)+ RNA was prepared from
quickly frozen human liver samples obtained from biopsy or from
cadaver donors by either ribonucleoside-vanadyl complex (17) or
guanidinium thiocyanate (18) procedures. cDNA reactions were
performed essentially as described in (19) employing as primers
either oligodeoxynucleotides prepared by the phosphotriester
method (20) or oligo(dT)12-18 (Collaborative Research). For
typical cDNA reactions 25-35 µg of poly(A)+ RNA and 40-80 pmol
of oligonucleotide primer were heated at 90° for 5 minutes in
50 mM NaCl. The reaction mixture was brought to final
concentrations of 20 mM Tris HCl pH 8.3, 20 mM KCl, 8 mM
MgCl2, 30 mM dithiothreitol, 1 mM dATP, dCTP, dGTP, dTTP
(plus 32P-dCTP (Amersham) to follow recovery of product) and
allowed to anneal at 42°C for 5 minutes. 100 units of AMV reverse
transcriptase (BRL) were added and incubation continued at 42°
for 45 minutes. Second strand DNA synthesis, S1 treatment,
size selection on polyacrylamide gels, deoxy(C) tailing and
annealing to pBR322 which was cleaved with PstI and deoxy(G)
tailed, were performed as previously described (21, 22). The
annealed mixture was used to transform E. coli K-12 strain 294
(23) by a published procedure (24).

(B) Screening of Recombinant Plasmids with 32P-labelled Probes.
E. coli transformants were grown on LB-agar plates containing
5 µg/ml tetracycline, transferred to nitrocellulose filter paper
(Schleicher and Schuell, BA85) and tested by hybridization
using a modification of the in situ colony screening procedure
(25). 32P-end labelled (26) oligodeoxynucleotide fragments
of from 12 to 16 nucleotides in length were used as direct
hybridization probes, or 32P-cDNA probes were synthesized
from RNA using oligo(dT) or oligodeoxynucleotide primers (19).
Filters were hybridised overnight in 5X Denhardt's solution
(27), 5xSSC (10xSSC = 1.5 M NaCl, 0.15 M Na citrate), 50 mM Na
phosphate pH 6.8, 20 µg/ml salmon sperm DNA at temperatures
ranging from 4° to 42° and washed in salt concentrations
varying from 1 to 0.2xSSC plus 0.1 percent SDS at temperatures
ranging from 4° to 42° depending on the length of the
32P-labelled probe (28). Dried filters were exposed to Kodak
XR-2 X-ray film using DuPont Lightning-Plus intensifying
screens at -80°.
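The salt concentrations implied by the SSC strengths used for hybridization and washing can be worked out as follows. The sketch assumes the conventional definition of 1xSSC (0.15 M NaCl, 0.015 M sodium citrate); the helper name ssc_molarity is hypothetical.

```python
# Conventional 1xSSC composition in mol/l (assumed standard definition).
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc_molarity(strength: float) -> dict[str, float]:
    """Return molar concentrations of each salt at the given SSC strength."""
    return {salt: round(conc * strength, 4) for salt, conc in SSC_1X.items()}


# Hybridization at 5x, washes ranging from 1x down to 0.2x.
for strength in (5, 1, 0.2):
    print(f"{strength}xSSC:", ssc_molarity(strength))
```

Lower SSC strength means lower ionic strength and therefore a more stringent wash, which is why the wash conditions are chosen according to probe length.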

(C) DNA Preparation and Restriction Enzyme Analysis. Plasmid DNA
was prepared in either large scale (29) or small scale
("miniprep"; 30) quantities and cleaved by restriction
endonucleases (New England Biolabs, BRL) following
manufacturers' conditions. Slab gel electrophoresis conditions
and electroelution of DNA fragments from gels have been
described (31).

(D) DNA Sequencing. DNA sequencing was accomplished by both the
method of Maxam and Gilbert (26) utilizing end-labelled DNA
fragments and by dideoxy chain termination (32) on single
stranded DNA from phage M13 mp7 subclones (33) utilizing
synthetic oligonucleotide (20) primers. Each region was
independently sequenced several times.

(E) Construction of 5' End of Albumin Gene for Direct Expression of
HSA. 10 µg (~16 pmol) of the ~1200 bp PstI insert of
plasmid F-47 was boiled in H2O for 5 minutes and combined
with 100 pmol of 32P-end labelled 5' primer
(dATGGATGCACACAAG). The mixture was quenched on ice and
brought to a final volume of 120 µl of 6 mM Tris HCl pH 7.5, 6
mM MgCl2, 60 mM NaCl, 0.5 mM dATP, dCTP, dGTP, dTTP at 0°.
10 units of DNA polymerase I Klenow fragment (Boehringer-
Mannheim) were added and the mixture incubated at 24° for 5
hr. Following phenol/chloroform extraction, the product was
digested with HpaII, electrophoresed in a 5 percent
polyacrylamide gel, and the desired 450 bp fragment
electroeluted. The single stranded overhang produced by XbaI
digestion of the vector plasmid pLeIF A25 (21) was filled in to
produce blunt DNA ends by adding deoxynucleoside triphosphates
to 10 µM and 10 units DNA polymerase I Klenow fragment to the
restriction endonuclease reaction mix and incubating at 12° for
10 minutes. Restriction endonuclease fragments (0.1 - 1 µg in
approximate molar equality) were annealed and ligated overnight
at 12° in 20 µl of 50 mM Tris HCl pH 7.6, 10 mM MgCl2, 0.1 mM
EDTA, 5 mM dithiothreitol, 1 mM rATP with 50 units T4 ligase
(N.E. Biolabs). Further details of plasmid construction are
discussed below.
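As an illustrative check of the 5' primer used in this section, the 15-mer can be read codon by codon. The sketch assumes the standard genetic code; the variable names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

PRIMER = "ATGGATGCACACAAG"  # the 5' primer given above (written dATGGATGCACACAAG)

# Split the primer into triplets and translate each one.
codons = [PRIMER[i:i + 3] for i in range(0, len(PRIMER), 3)]
peptide = "".join(CODON_TABLE[c] for c in codons)

print(codons)
print(peptide)  # one-letter code: initiator Met followed by four residues
```

The first triplet is the added initiation codon and the remaining twelve nucleotides encode Asp-Ala-His-Lys, so the primer places ATG directly in frame with the start of the mature coding sequence.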

(F) Protein Analysis. Two ml cultures of recombinant E. coli
strains were grown in either LB or M9 media plus 5 µg/ml
tetracycline to densities of A550 = 1.0, pelleted, washed,
repelleted, and suspended in 2 ml of LB or supplemented M9 (M9
+ 0.2 percent glucose, 1 µg/ml thiamine, 20 µg/ml standard
amino acids except methionine was 2 µg/ml and tryptophan was
excluded). Each growth medium also contained 5 µg/ml
tetracycline and 100 µCi 35S-methionine (NEN; 1200 Ci/mmol).
After 1 hr incubation at 37°, bacteria were pelleted, freeze-
thawed and resuspended in 200 µl 50 mM Tris HCl pH 7.5, 0.12 mM
NaEDTA then placed on ice for 10 minutes following subsequent
additions of lysozyme to 1 mg/ml, NP 40 to 0.2 percent, and NaCl
to 0.35 M. The lysate was adjusted to 10 mM MgCl2 and
incubated with 50 µg/ml DNase I (Worthington) on ice for 30
min. Insoluble material was removed by mild centrifugation.
Samples were immunoprecipitated with rabbit anti-HSA (Cappel
Labs) and staphylococcal absorbent (Pansorbin; Calbiochem) as
described (34), and subjected to SDS polyacrylamide gel
electrophoresis (35).

(G) cDNA Cloning. Initial cDNA clones primed with oligo(dT) were
screened by colony hybridization with both total liver cDNA (to
identify abundant RNA species containing clones) and with two
32P-labelled cDNAs primed from liver mRNA by two sets of four
11 base oligodeoxynucleotides synthesized to represent the
possible coding variations for amino acids 546-549 and 294-297
of HSA. Positive colonies never contained more than about the
3' 1/2 of the protein coding region of the expected HSA mRNA
sequence. (The longest of these recombinants was designated
B-44.) Since existing procedures were unable to directly copy
an mRNA of the expected size (~2000 bp), synthetic
oligodeoxynucleotides were prepared to correspond to the
antimessage strand at regions near the 5' extreme of B-44.
From the nucleotide sequence of B-44, we constructed a 12 base
oligodeoxynucleotide corresponding to amino acids 365-373.
This was used to prime cDNA synthesis of liver mRNA and produce
cDNA clones in pBR322 containing the 5' portion of the HSA
message while overlapping the existing B-44 recombinant.
Approximately 400 resulting clones were screened by colony
hybridization with a 16 base oligodeoxynucleotide fragment
located slightly upstream in the mRNA sequence we had thus far
determined. Approximately 40 percent of the colonies
hybridized to both probes. Many of those colonies which failed
to contain hybridizing plasmids presumably resulted from RNA
self-priming or priming with contaminating oligo(dT) during
reverse transcription, or lost the 3' region containing the
sequence used for screening. "Miniprep" amounts of plasmid DNA
from hybridizing colonies were digested with PstI. Three
recombinant plasmids contained sufficiently large inserts to
code for the remaining 5' portion of the HSA message. Two of
these (F-15 and F-47) contained the extreme 5' coding portion
of the gene but failed to extend back to a PstI site necessary
for joining with B-44 to reform the complete albumin gene.
Recombinant F-61 possessed this site but lacked the 5' extreme
end. A three part reconstruction of the entire message
sequence was possible employing restriction endonuclease sites
in common with the part length clones F-47, F-61 and B-44
(Fig. 1).
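The idea of a set of oligodeoxynucleotides covering the possible coding variations of a short peptide can be sketched as below. The peptide Lys-Met-His-Asp is hypothetical (it is not taken from the HSA sequence) and is chosen only because it yields four 11-base variants; the standard genetic code is assumed and the function names are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def coding_variants(peptide: str, length: int) -> list[str]:
    """All distinct message-sense sequences encoding `peptide`, cut to `length` bases."""
    pools = [[c for c, a in CODON_TABLE.items() if a == aa] for aa in peptide]
    return sorted({"".join(combo)[:length] for combo in product(*pools)})


def reverse_complement(seq: str) -> str:
    """Antimessage strand, as needed for a primer that anneals to mRNA."""
    return seq.translate(COMPLEMENT)[::-1]


PEPTIDE = "KMHD"  # hypothetical four-residue stretch, not from HSA
message_sense = coding_variants(PEPTIDE, 11)  # drop the final wobble base
primers = [reverse_complement(s) for s in message_sense]

for sense, primer in zip(message_sense, primers):
    print(sense, primer)
```

Cutting the sequence to 11 bases omits the third position of the last codon, which removes one source of ambiguity and keeps the set small.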

An additional cDNA clone extending further 5' was obtained by
similar oligodeoxynucleotide primed cDNA synthesis (from a
primer corresponding to amino acid codons no. 175-179).
Although not employed in the construction of the mature HSA
expression plasmid, this cDNA clone (P-14) allowed
determination of the DNA sequence of the "prepro" peptide
coding and 5' non-coding regions of the HSA mRNA. 

The mature HSA mRNA sequence was joined to a vector plasmid for
direct expression of the mature protein in E. coli via the trp
promoter-operator. The plasmid pLeIF A25 directs the
expression of human leukocyte interferon A (IFN-α2) (21). It
was digested with XbaI and the cleavage site "filled in" to
produce blunt DNA ends with DNA polymerase I Klenow fragment
and deoxynucleoside triphosphates. After subsequent digestion
with PstI, a "vector" fragment was gel purified that contained
pBR322 sequences and a 300 bp fragment of the E. coli trp
promoter, operator, and ribosome binding site of the trp leader
peptide terminating in the artificially blunt ended XbaI
cleavage site. A 15 base oligodeoxynucleotide was designed to
contain the initiation codon ATG followed by the 12 nucleotides
coding for the first four amino acids of HSA as determined by
DNA sequence analysis of clone F-47. In a process referred to
as "primer repair", the gene-containing PstI fragment of F-47
was denatured, annealed with excess 15-mer and reacted with DNA
polymerase I Klenow fragment and deoxynucleoside triphosphates.
This reaction extends a new second strand downstream from the
annealed oligonucleotide, degrades the single stranded DNA
upstream of codon number one and then polymerizes upstream
three nucleotides complementary to ATG. In addition, when this
product is blunt-end ligated to the prepared vector fragment,
its initial adenosine residue recreates an XbaI restriction
site. Following the primer repair reaction, the DNA was
digested with HpaII and a 450 bp fragment containing the 5'
portion of the mature albumin gene was gel purified (see Fig.
1). This fragment was annealed and ligated to the vector
fragment and to the gel isolated HpaII to PstI portion of F-47
and used to transform E. coli cells. Diagnostic restriction
endonuclease digests of plasmid minipreps identified the
recombinant A-26 which contained the 5' portion of the mature
albumin coding region ligated properly to the trp
promoter-operator. For the final steps in assembly, the A-26
plasmid was digested with BglII plus PstI and the ~4 kb fragment
was gel purified. This was annealed and ligated to a 350 bp PstI,
BglII partial digestion fragment purified from F-61 and a 1000
bp PstI fragment of B-44. Restriction endonuclease analysis of
resulting transformants identified plasmids containing the
entire HSA coding sequence properly aligned for direct
expression of the mature protein. One such recombinant plasmid
was designated pHSA1. When E. coli containing pHSA1 is grown
in minimal media lacking tryptophan, the cells produce a
protein which specifically reacts with HSA antibodies and
comigrates with HSA in SDS polyacrylamide electrophoresis (Fig.
2). No such protein is produced by identical recombinants
grown in rich broth, implying that production in E. coli of the
putative HSA protein is under control of the trp
promoter-operator as designed. To ensure the integrity of the
HSA structural gene in the recombinant plasmid, pHSA1 was
subjected to DNA sequence analysis.
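
The design of the 15-mer used for primer repair, and the way its first adenosine restores the XbaI site at the blunt junction, can be checked with a short Python sketch. Several things here are assumptions for illustration: the four codons are one reading of the start of the mature coding sequence (Asp-Ala-His-Lys), the upstream vector bases are a placeholder written as N, and the helper names are hypothetical. The XbaI recognition sequence TCTAGA, cut after the first T, is the only external fact relied on.

```python
XBAI_SITE = "TCTAGA"  # XbaI cuts between the first T and CTAGA


def filled_in_xbai_end(upstream):
    """Top strand of a vector end after XbaI cleavage and Klenow fill-in.

    Filling in the 5' CTAG overhang leaves a blunt end reading ...TCTAG.
    """
    return upstream + XBAI_SITE[:-1]


def design_primer(first_codons):
    """Initiation codon ATG followed by the codons of the first residues."""
    return "ATG" + "".join(first_codons)


# Assumed codons for Asp-Ala-His-Lys; not taken from a verified listing.
codons = ["GAT", "GCA", "CAC", "AAG"]
primer = design_primer(codons)

# "NNNNNN" stands in for the unspecified trp promoter sequence upstream.
junction = filled_in_xbai_end("NNNNNN") + primer

print(len(primer))            # length of the designed oligonucleotide
print(XBAI_SITE in junction)  # True when the site is regenerated
```

A repaired fragment beginning with any base other than A would leave TCTAG followed by a different base, and the site would not be regenerated.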

(H) DNA Sequence Analysis

The albumin cDNA portion (and surrounding regions) of pHSA1
was sequenced to completion by both the chemical degradation
method of Maxam and Gilbert (26) and the dideoxy chain
termination procedure employing templates derived from single
stranded M13 mp7 phage derivatives (32, 33). All nucleotides
were sequenced at least twice. The DNA sequence is shown in
Fig. 3 along with the predicted amino acid sequence of the HSA
protein. The DNA sequence further 5' to the mature HSA coding
region was also determined from the cDNA clone P-14 and is
included in Fig. 3.

DNA sequence analysis confirmed that the artificial initiation
codon and the complete mature HSA coding sequence directly
follow the E. coli trp promoter-operator as desired. The ATG
initiator follows the putative E. coli ribosome binding
sequence (36) of the trp leader peptide (37) by 9 nucleotides.

Translation of the DNA sequence of pHSA1 predicts a mature HSA
protein of 585 amino acids. Various published protein
sequences of HSA disagree at about 20 amino acids. The present
sequence differs by eleven residues from Moulon et al. (12),
and by 28 residues from that reported in the Dayhoff catalogue
(9) credited as arising primarily from Behrens et al. (10) with
contributions by Moulon et al. (12). Most of these differences
represent inversions of pairs of adjacent residues or
glutamine-glutamic acid disagreements. Only at five of the 585
residues does our sequence differ from the residue reported by
both Dayhoff (9) and Moulon et al. (12), and three of these
five differences represent glutamine-glutamic acid interchanges
(underlined in Figure 3). At all discrepant positions the
nucleotide sequencing has been carefully rechecked and it is
unlikely that DNA sequencing errors are the cause of these
reported differences. The possibility of artifacts introduced
by cDNA cloning cannot be ruled out. However, other likely
explanations exist for the amino acid sequence differences
among various reports. These include changes in amidation
(affecting glutamine-glutamic acid discrimination) occurring
either in vivo or during protein sequencing (38). Polymorphism
in HSA proteins may also account for some differences; over
twenty genetic variants of HSA have been detected by protein
electrophoresis (39) but have not yet been analyzed at the
amino acid sequence level. It is also worth noting that our
predicted HSA protein sequence is 585 amino acids long, in
agreement with Moulon (12) but not Dayhoff (9). The difference
is accounted for by the deletion (in ref. 9) of one
phenylalanine (Phe) residue in a Phe-Phe pair at amino acids
156-157.
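
The kinds of disagreement discussed above (glutamine-glutamic acid interchanges and inversions of adjacent residues) can be tallied mechanically once two sequences are aligned. The following Python sketch is an illustration under stated assumptions: `classify_differences` is a hypothetical helper, the sequences use one-letter codes, and the second sequence is an invented variant rather than any published report.

```python
def classify_differences(ours, other):
    """Classify mismatches between two aligned, equal-length sequences.

    Returns 1-based positions grouped as Glu/Gln interchanges (E vs Q),
    inversions of adjacent residue pairs, and all other differences.
    """
    if len(ours) != len(other):
        raise ValueError("sequences must be aligned to equal length")
    found = {"glu_gln": [], "inversion": [], "other": []}
    i, n = 0, len(ours)
    while i < n:
        a, b = ours[i], other[i]
        if a == b:
            i += 1
        elif {a, b} == {"E", "Q"}:
            found["glu_gln"].append(i + 1)
            i += 1
        elif i + 1 < n and a == other[i + 1] and ours[i + 1] == b:
            # The two residues appear in swapped order in the other report.
            found["inversion"].append((i + 1, i + 2))
            i += 2
        else:
            found["other"].append(i + 1)
            i += 1
    return found


ours = "DAHKSEVAHRFK"     # short toy sequence
variant = "DAHKSQVAHFRK"  # invented: E6 -> Q, residues 10 and 11 swapped
print(classify_differences(ours, variant))
```

The Glu/Gln test is applied first, so a swapped E-Q pair is counted as two interchanges rather than one inversion; that ordering is a design choice of this sketch.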



When compared to the DNA sequence of a rat serum albumin cDNA
clone (16) the present mature HSA sequence shares 74 percent
homology at the nucleotide and 73 percent homology at the amino
acid level. (The rat SA protein is one amino acid shorter than
HSA; the carboxy terminal residue of HSA is absent in the rat
protein.) All 35 cysteine residues are located in identical
positions in both proteins. The predicted "prepro" peptide
region of HSA shares 76 percent nucleotide and 75 percent amino
acid homology with that reported from the rat cDNA clone (16).
Interspecies sequence homology is reduced in the portion of the
3' untranslated region which can be compared (the published rat
cDNA clone ends before the 3' mRNA terminus). The HSA cDNA
contains the hexanucleotide AATAAA 26 nucleotides before the
site of poly(A) addition. This is a common feature of
eukaryotic mRNAs first noted by Proudfoot and Brownlee (40).
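
Both figures quoted above, percent homology between aligned sequences and the distance of the AATAAA hexanucleotide from the poly(A) site, are simple to compute. The Python sketch below is illustrative only: the function names are hypothetical, the input strings are toy data rather than the human or rat sequences, and the distance is assumed to be counted from the first base of the hexanucleotide to the end of the 3' untranslated region, which the text does not specify.

```python
def percent_identity(a, b):
    """Percentage of aligned positions at which two sequences agree."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def polya_signal_distance(utr, signal="AATAAA"):
    """Bases from the start of the last `signal` to the end of `utr`.

    `utr` is taken to end at the poly(A) addition site. Returns None
    when the signal is absent.
    """
    i = utr.rfind(signal)
    return None if i == -1 else len(utr) - i


# Toy inputs, not real albumin sequence.
print(percent_identity("ACGTACGT", "ACGTTCGA"))
print(polya_signal_distance("GCTT" + "AATAAA" + "C" * 20))
```

Real comparisons would first require an alignment that accounts for insertions and deletions, such as the single-residue length difference noted between the rat and human proteins.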

Pharmaceutical Compositions 

The compounds of the present invention can be formulated according to
known methods to prepare pharmaceutically useful compositions, whereby
the polypeptide hereof is combined in admixture with a pharmaceutically
acceptable carrier vehicle. Suitable vehicles and their formulation are
described in Remington's Pharmaceutical Sciences by E.W. Martin, which is
hereby incorporated by reference. Such compositions will contain an
effective amount of the protein hereof together with a suitable amount of
vehicle in order to prepare pharmaceutically acceptable compositions
suitable for effective administration to the host. One preferred mode of
administration is parenteral.

CLAIMS:



1. A method of constructing a DNA sequence encoding a
polypeptide comprising a functional protein or a bioactive
portion thereof, said DNA sequence being designed for
insertion together with appropriately positioned translational
start and stop signals into an expression vector
under the control of a microbially operable promoter,
comprising the steps of:

(a) providing messenger RNA comprising the entire 
coding sequence of said polypeptide, 

(b) obtaining by reverse transcription from the 
messenger RNA of step (a) a series of fragments of 
double stranded cDNA, each of said fragments 
corresponding in sequence to a portion of said 
coding sequence and thus encoding a portion of 
said polypeptide, wherein said fragments overlap 
in sequence at the respective terminal regions 
thereof, the overlapping portions thereof 
containing common restriction endonuclease sites, 
said fragments in totality comprising the entire 
coding sequence of said polypeptide, 

(c) cleaving the fragments of step (b) so as to 
prepare corresponding fragments which, when 
properly ligated, encode said polypeptide, and 

(d) ligating the fragments obtained from step (c). 

2. A method of constructing a vector for use in
expressing a polypeptide comprising performing the method of 
claim 1 to produce a product comprising the entire coding 
sequence of said polypeptide, and introducing the product 
into a vector under proper reading frame control of an 
expression promoter. 

3. The method according to claim 1 or 2 wherein said 
polypeptide comprises the amino acid sequence of human serum 
albumin. 

4. The method according to claim 3 wherein the polypeptide
contains a cleavable conjugate or microbial signal
protein attached to the N-terminus of the ordinarily first
amino acid of said human serum albumin.

5. The method according to claim 4 wherein said cleavable 
conjugate is the amino acid methionine. 

6. A method according to any preceding claim wherein said 
DNA sequence is the gene encoding human serum albumin. 

7. A DNA sequence consisting essentially of a sequence 
encoding human serum albumin. 

8. A DNA sequence according to claim 7 operably linked
with a DNA vector capable of effecting the microbial
expression of said sequence so as to prepare the corresponding
human serum albumin.

9. A replicable microbial expression vehicle capable, in
a transformant microorganism, of expressing the DNA sequence
according to claim 7.

10. A microorganism transformed with the vehicle according
to claim 9.

11. A fermentation culture comprising a transformed 
microorganism according to claim 10. 

12. The microorganism according to claim 10, obtained by 
transforming an E. coli bacterial or a yeast strain. 

13. The plasmid pHSA1.

14. An E. coli bacterial strain transformed with the 
plasmid according to claim 13. 

15. A process which comprises microbially expressing human
serum albumin in mature form.

16. The use of human serum albumin prepared by the process
of claim 15 for therapeutic treatment of humans or for
preparing pharmaceutical compositions useful for such
treatment.

[Fig. 3: Nucleotide sequence of the HSA cDNA with the predicted amino
acid sequence above each codon, showing the 5' non-coding region, the
"prepro" peptide, the mature protein (residues numbered at intervals of
50, ending at the termination codon) and the 3' untranslated region up
to the poly(A) site.]